
TTS/STT: Integrated Sarvam STT and TTS #621

Merged

Prajna1999 merged 15 commits into main from feature/stt-sarvam on Mar 5, 2026

Conversation

Collaborator

@rajagopalmotivate rajagopalmotivate commented Feb 23, 2026

Summary

Target issue is #564
Explain the motivation for making this change. What existing problem does the pull request solve?
Added Sarvam as a speech-to-text provider.

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, ensure it is tested and has test cases.

Summary by CodeRabbit

  • New Features

    • Added SarvamAI as a new provider offering speech-to-text and text-to-speech capabilities.
    • Made the instructions parameter optional for speech-to-text requests for greater flexibility.
  • Tests

    • Added end-to-end test coverage and sample speech data to validate the new provider and transcription flows.


coderabbitai bot commented Feb 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds SarvamAI provider support: new provider enum/config, model updates to accept sarvam variants, SarvamAIProvider implementation (STT/TTS), registry registration, Kaapi→Sarvam mapping, tests with embedded audio sample, and a dependency on sarvamai.

Changes

  • Provider core & models (backend/app/core/providers.py, backend/app/models/llm/request.py, backend/app/services/llm/providers/registry.py): Added SARVAMAI provider entry and provider config; made STTLLMParams.instructions optional; extended the NativeCompletionConfig/KaapiCompletionConfig provider Literals to include Sarvam variants; registered SARVAMAI_NATIVE in the provider registry.
  • SarvamAI provider implementation (backend/app/services/llm/providers/sai.py): New SarvamAIProvider class implementing client creation, input parsing, STT and TTS execution, response/usage construction, and error handling consistent with other providers.
  • Parameter mapping / config transform (backend/app/services/llm/mappers.py): Added map_kaapi_to_sarvam_params and updated transform_kaapi_config_to_native to produce sarvamai-native configs and emit warnings for unsupported Kaapi params when mapping.
  • Tests & test data (backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py, backend/app/tests/services/llm/providers/STTproviders/test_data_speechsamples.py): Added an end-to-end-style test script wiring the provider registry, client creation, a temporary WAV from embedded base64 Tamil audio (mydata), and an execution/demo flow for STT.
  • Dependency (backend/pyproject.toml): Added sarvamai>=0.1.25 to project dependencies.

Sequence Diagram

sequenceDiagram
    participant Client
    participant SarvamAIProvider
    participant FileSystem
    participant SarvamAIClient
    participant Response

    Client->>SarvamAIProvider: execute(completion_config, query, resolved_input)
    SarvamAIProvider->>SarvamAIProvider: _parse_input(resolved_input, type)
    SarvamAIProvider->>FileSystem: open/read audio file (if STT)
    FileSystem-->>SarvamAIProvider: audio bytes
    SarvamAIProvider->>SarvamAIClient: speech_to_text.transcribe(model, language_code, mode)
    SarvamAIClient-->>SarvamAIProvider: transcript + raw_response
    SarvamAIProvider->>SarvamAIProvider: build LLMCallResponse (output, usage, optional raw)
    SarvamAIProvider-->>Response: (LLMCallResponse, None) or (None, error_message)
    Response-->>Client: return result
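The STT leg of the flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the real SarvamAIProvider builds a full LLMCallResponse, and the transcribe call mirrors the snippet quoted later in this review.

```python
import os
from typing import Optional, Tuple


def execute_stt(client, model: str, language_code: str,
                audio_path: str) -> Tuple[Optional[str], Optional[str]]:
    """Sketch of the STT leg: returns (transcript, None) on success,
    or (None, error_message) on failure, mirroring the (result, error)
    tuple convention in the sequence diagram above."""
    if not os.path.isfile(audio_path):
        return None, f"Audio file not found: {audio_path}"
    try:
        with open(audio_path, "rb") as audio_file:
            response = client.speech_to_text.transcribe(
                file=audio_file,
                model=model,
                language_code=language_code,
            )
        return response.transcript, None
    except Exception as e:  # provider-boundary catch, as in other providers
        return None, f"SarvamAI STT transcription failed: {e}"
```

The (result, error) tuple keeps provider failures out of the exception path, matching the diagram's "(LLMCallResponse, None) or (None, error_message)" return shape.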

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested labels

enhancement

Suggested reviewers

  • kartpop
  • vprashrex

Poem

🐰 A fresh voice hops into the stack,
Sarvam listens, then writes it back.
Mappers twirl and providers sing,
From whispered speech to digital string —
Hooray! New audio magic in the pack. 🎶

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 66.67%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title 'TTS/STT: Integrated Sarvam STT and TTS' accurately reflects the primary changes: adding both Sarvam STT and TTS provider support to the backend.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (2)
backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py (2)

41-107: Avoid duplicating provider registry logic in tests.
Consider importing LLMProvider/get_llm_provider from the production registry module instead of re‑implementing it here to prevent drift and keep business logic in services.

As per coding guidelines: backend/app/services/**/*.py: Implement business logic in services located in backend/app/services/.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py`
around lines 41 - 107, This test duplicates the production provider registry and
factory (LLMProvider and get_llm_provider); instead of re-implementing them,
remove the LLMProvider class and get_llm_provider function from the test and
import the existing implementations from the production services module (the
module that defines LLMProvider/get_llm_provider and BaseProvider), then update
the test to call the imported LLMProvider.get_provider_class and
get_llm_provider directly (so the test uses the canonical _registry and
credential handling rather than a copy).
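For illustration only, the registry/factory pattern the comment wants reused (rather than copied) typically looks like the sketch below. Names follow the review (LLMProvider, get_provider_class, _registry), but the real signatures live in the production registry module.

```python
from typing import Dict, Type


class BaseProvider:
    """Minimal stand-in for the project's provider base class."""

    def __init__(self, client) -> None:
        self.client = client


class LLMProvider:
    """Illustrative registry: maps provider-type strings to classes."""

    _registry: Dict[str, Type[BaseProvider]] = {}

    @classmethod
    def register(cls, provider_type: str,
                 provider_cls: Type[BaseProvider]) -> None:
        cls._registry[provider_type] = provider_cls

    @classmethod
    def get_provider_class(cls, provider_type: str) -> Type[BaseProvider]:
        if provider_type not in cls._registry:
            raise ValueError(f"Unknown provider type: {provider_type}")
        return cls._registry[provider_type]


class SarvamAIProvider(BaseProvider):
    pass


LLMProvider.register("sarvamai-native", SarvamAIProvider)
```

With a single canonical registry like this, tests import LLMProvider instead of redefining it, so registry changes cannot drift out of sync with test copies.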

110-170: Convert the ad‑hoc __main__ harness into a pytest test with fixtures/factory (or move to scripts).
This keeps tests discoverable and aligns with the fixture/factory pattern expected in the tests package.

As per coding guidelines: backend/app/tests/**/*.py: Use factory pattern for test fixtures in backend/app/tests/.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py`
around lines 110 - 170, The ad-hoc __main__ harness should be converted into a
pytest test using fixtures/factories: replace the top-level script with a test
function (e.g., test_sarvam_stt_provider) that uses fixtures to provide SARVAM
credentials, a temp file (tmp_path or tmp_path_factory) for the WAV bytes
(mydata), and a registry fixture that ensures LLMProvider._registry contains
SarvamAIProvider; inside the test, call
LLMProvider.get_provider_class("sarvamai-native"), create the client via
ProviderClass.create_client(credentials=mock_credentials), instantiate
ProviderClass(client=client), build NativeCompletionConfig and QueryParams,
write mydata to the tmp file path and call
instance.execute(completion_config=..., query=...,
resolved_input=temp_audio_path), then assert expected result/error;
alternatively move the harness to a scripts/ integration script if it’s not a
unit test. Ensure fixtures are placed in tests/conftest or use existing factory
helpers rather than hard-coding credentials or file handling.
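A pytest-shaped version of that harness might look like the sketch below. The mock client, helper name, and model string are illustrative assumptions; a real test would exercise SarvamAIProvider.execute rather than calling the client directly.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def make_fake_sarvam_client(transcript: str) -> MagicMock:
    """Factory-style helper: a stand-in for a real SarvamAI client."""
    client = MagicMock()
    client.speech_to_text.transcribe.return_value = SimpleNamespace(
        transcript=transcript, request_id="req-123"
    )
    return client


def test_sarvam_stt_transcribes_audio(tmp_path):
    # tmp_path is pytest's built-in temporary-directory fixture
    audio_path = tmp_path / "sample.wav"
    audio_path.write_bytes(b"RIFF0000WAVE")  # placeholder bytes, not real audio
    client = make_fake_sarvam_client("vanakkam")
    with open(audio_path, "rb") as audio_file:
        response = client.speech_to_text.transcribe(
            file=audio_file, model="saarika:v2", language_code="unknown"
        )
    assert response.transcript == "vanakkam"
```

This keeps the test discoverable by pytest, avoids hard-coded credentials and manual temp-file cleanup, and isolates the file handling behind tmp_path.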
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/services/llm/providers/sai.py`:
- Around line 9-18: The import list in the module imports TextOutput twice;
remove the duplicate TextOutput entry from the import tuple (retain one
occurrence alongside NativeCompletionConfig, LLMCallResponse, QueryParams,
TextOutput, LLMResponse, Usage, TextContent) so the import statement in
app.models.llm no longer contains repeated symbols.
- Around line 25-33: The __init__ method of SarvamAIProvider lacks an explicit
return type annotation; update the signature of SarvamAIProvider.__init__ to
include the return type "-> None" (keeping the existing parameter type hint for
client: SarvamAI) so it conforms to the project's type-hinting rules.
- Around line 117-124: Prefix both log messages in the _execute_stt handler with
the function name in square brackets and mask sensitive values: update the
logger.info call that logs sarvam_response.request_id to include
"[_execute_stt]" at the start and use mask_string(sarvam_response.request_id)
instead of the raw id, and update the logger.error call to similarly prefix
"[_execute_stt]" before the error_message (keeping exc_info=True).

In
`@backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py`:
- Line 193: Add a trailing newline at the end of the test file
backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py
so the file ends with a single newline character (ensure the EOF contains "\n");
this fixes the missing newline at EOF and satisfies POSIX/linters.
- Line 174: The print statement print(f"\n--- SarvamAI STT Result ---") uses an
unnecessary f-string; remove the f prefix and change it to print("\n--- SarvamAI
STT Result ---") so the literal is printed without treating it as a formatted
string.
- Around line 105-106: Update the logger.error call that currently reads
logger.error(f"Failed to initialize {provider_type} client: {e}", exc_info=True)
to prefix the message with the current function name in square brackets (e.g.,
logger.error(f"[{function_name}] Failed to initialize {provider_type} client:
{mask_string(e)}", exc_info=True)); keep exc_info=True and use mask_string for
any sensitive values; change only the logger.error invocation (the subsequent
raise RuntimeError can remain unchanged).
- Around line 19-23: The test file contains duplicate import statements (e.g.,
repeated "import os" and "import tempfile"); remove any repeated import lines so
each module is imported only once in test_STT_SarvamProvider.py, keep the needed
imports (os, tempfile) and delete the redundant/commented duplicates (and any
stray "temporary import" placeholder), then run the linter to ensure imports are
clean and sorted.

---

Nitpick comments:
In
`@backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py`:
- Around line 41-107: This test duplicates the production provider registry and
factory (LLMProvider and get_llm_provider); instead of re-implementing them,
remove the LLMProvider class and get_llm_provider function from the test and
import the existing implementations from the production services module (the
module that defines LLMProvider/get_llm_provider and BaseProvider), then update
the test to call the imported LLMProvider.get_provider_class and
get_llm_provider directly (so the test uses the canonical _registry and
credential handling rather than a copy).
- Around line 110-170: The ad-hoc __main__ harness should be converted into a
pytest test using fixtures/factories: replace the top-level script with a test
function (e.g., test_sarvam_stt_provider) that uses fixtures to provide SARVAM
credentials, a temp file (tmp_path or tmp_path_factory) for the WAV bytes
(mydata), and a registry fixture that ensures LLMProvider._registry contains
SarvamAIProvider; inside the test, call
LLMProvider.get_provider_class("sarvamai-native"), create the client via
ProviderClass.create_client(credentials=mock_credentials), instantiate
ProviderClass(client=client), build NativeCompletionConfig and QueryParams,
write mydata to the tmp file path and call
instance.execute(completion_config=..., query=...,
resolved_input=temp_audio_path), then assert expected result/error;
alternatively move the harness to a scripts/ integration script if it’s not a
unit test. Ensure fixtures are placed in tests/conftest or use existing factory
helpers rather than hard-coding credentials or file handling.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c77e17 and 99fffab.

⛔ Files ignored due to path filters (1)
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • backend/app/core/providers.py
  • backend/app/models/llm/request.py
  • backend/app/services/llm/providers/registry.py
  • backend/app/services/llm/providers/sai.py
  • backend/app/services/llm/providers/tests_data.py
  • backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py
  • backend/app/tests/services/llm/providers/STTproviders/test_data_speechsamples.py
  • backend/dump.rdb
  • backend/pyproject.toml

Comment on lines +25 to +33
class SarvamAIProvider(BaseProvider):
def __init__(self, client: SarvamAI):
"""Initialize SarvamAI provider with client.

Args:
client: SarvamAI client instance
"""
super().__init__(client)
self.client = client

⚠️ Potential issue | 🟡 Minor

Add explicit return type for __init__.

🔧 Proposed fix
-    def __init__(self, client: SarvamAI):
+    def __init__(self, client: SarvamAI) -> None:
As per coding guidelines: `**/*.py`: Always add type hints to all function parameters and return values in Python code.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class SarvamAIProvider(BaseProvider):
def __init__(self, client: SarvamAI):
"""Initialize SarvamAI provider with client.
Args:
client: SarvamAI client instance
"""
super().__init__(client)
self.client = client
class SarvamAIProvider(BaseProvider):
def __init__(self, client: SarvamAI) -> None:
"""Initialize SarvamAI provider with client.
Args:
client: SarvamAI client instance
"""
super().__init__(client)
self.client = client
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/providers/sai.py` around lines 25 - 33, The __init__
method of SarvamAIProvider lacks an explicit return type annotation; update the
signature of SarvamAIProvider.__init__ to include the return type "-> None"
(keeping the existing parameter type hint for client: SarvamAI) so it conforms
to the project's type-hinting rules.

Comment on lines +117 to +124
logger.info(
f"[{provider_name}.execute_stt] Successfully transcribed audio: {sarvam_response.request_id}"
)
return llm_response, None

except Exception as e:
error_message = f"SarvamAI STT transcription failed: {str(e)}"
logger.error(f"[{provider_name}.execute_stt] {error_message}", exc_info=True)

⚠️ Potential issue | 🟡 Minor

Prefix _execute_stt logs with the function name.

🔧 Proposed fix
-            logger.info(
-                f"[{provider_name}.execute_stt] Successfully transcribed audio: {sarvam_response.request_id}"
-            )
+            logger.info(
+                f"[SarvamAIProvider._execute_stt] Successfully transcribed audio: {sarvam_response.request_id}"
+            )
@@
-            logger.error(f"[{provider_name}.execute_stt] {error_message}", exc_info=True)
+            logger.error(
+                f"[SarvamAIProvider._execute_stt] {error_message}",
+                exc_info=True,
+            )
As per coding guidelines: Prefix all log messages with the function name in square brackets: `logger.info(f"[function_name] Message {mask_string(sensitive_value)}")`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/providers/sai.py` around lines 117 - 124, Prefix
both log messages in the _execute_stt handler with the function name in square
brackets and mask sensitive values: update the logger.info call that logs
sarvam_response.request_id to include "[_execute_stt]" at the start and use
mask_string(sarvam_response.request_id) instead of the raw id, and update the
logger.error call to similarly prefix "[_execute_stt]" before the error_message
(keeping exc_info=True).
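mask_string's actual implementation isn't shown in this PR (a later comment imports it from app.core.util). A typical masker of this kind keeps only a short suffix visible; the sketch below is an assumption about its behavior, not the project's real helper.

```python
def mask_string(value, visible: int = 4) -> str:
    """Illustrative masker: hide all but the last few characters.
    The project's real mask_string may differ."""
    s = str(value)
    if len(s) <= visible:
        return "*" * len(s)
    return "*" * (len(s) - visible) + s[-visible:]
```

Used in a log line, this gives e.g. logger.info(f"[_execute_stt] ... request_id={mask_string(request_id)}"), so identifiers never appear in full in the logs.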

Comment on lines +105 to +106
logger.error(f"Failed to initialize {provider_type} client: {e}", exc_info=True)
raise RuntimeError(f"Could not connect to {provider_type} services.")

⚠️ Potential issue | 🟡 Minor

Prefix error log with the function name.

🔧 Proposed fix
-        logger.error(f"Failed to initialize {provider_type} client: {e}", exc_info=True)
+        logger.error(
+            f"[get_llm_provider] Failed to initialize {provider_type} client: {e}",
+            exc_info=True,
+        )
As per coding guidelines: Prefix all log messages with the function name in square brackets: `logger.info(f"[function_name] Message {mask_string(sensitive_value)}")`.
📝 Committable suggestion


Suggested change
logger.error(f"Failed to initialize {provider_type} client: {e}", exc_info=True)
raise RuntimeError(f"Could not connect to {provider_type} services.")
logger.error(
f"[get_llm_provider] Failed to initialize {provider_type} client: {e}",
exc_info=True,
)
raise RuntimeError(f"Could not connect to {provider_type} services.")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py`
around lines 105 - 106, Update the logger.error call that currently reads
logger.error(f"Failed to initialize {provider_type} client: {e}", exc_info=True)
to prefix the message with the current function name in square brackets (e.g.,
logger.error(f"[{function_name}] Failed to initialize {provider_type} client:
{mask_string(e)}", exc_info=True)); keep exc_info=True and use mask_string for
any sensitive values; change only the logger.error invocation (the subsequent
raise RuntimeError can remain unchanged).

if temp_audio_file_path and os.path.exists(temp_audio_file_path):
os.remove(temp_audio_file_path)
print(f"Cleaned up temporary file: {temp_audio_file_path}")

No newline at end of file

⚠️ Potential issue | 🟡 Minor

Add a trailing newline at EOF.

🧰 Tools
🪛 Ruff (0.15.2)

[warning] 193-193: No newline at end of file

Add trailing newline

(W292)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py`
at line 193, Add a trailing newline at the end of the test file
backend/app/tests/services/llm/providers/STTproviders/test_STT_SarvamProvider.py
so the file ends with a single newline character (ensure the EOF contains "\n");
this fixes the missing newline at EOF and satisfies POSIX/linters.

if not model:
return None, "Missing 'model' in native params for SarvamAI STT"

inputlanguageofaudio = generation_params.get("input_language")
Collaborator

write better variable name

@Prajna1999 Prajna1999 (Collaborator) left a comment

There should be a KaapiLLMParam to SarvamLLMParam mapper function inside mappers.py. Approving to fast-track; make sure the function works with pass-through Sarvam parameters as well. Basic cleanup of the test scripts is required.

sarvam_response = self.client.speech_to_text.transcribe(
file=audio_file,
model=model,
# SarvamAI's flagship STT model Saarika supports mixed language content with automatic detection of languages within the sentance
Collaborator

Comments unnecessary

# SarvamAI does not provide token usage directly for STT, so we'll use placeholders
# You might estimate based on transcript length or set to 0
input_tokens_estimate = 0 # Not directly provided by SarvamAI STT
output_tokens_estimate = len(sarvam_response.transcript.split()) # Estimate by word count
Collaborator

This is not the best way to count tokens. But if no other way is available then good to go.
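If only a rough figure is needed, a character-based heuristic (~4 characters per token is a common rule of thumb for English text; it is less reliable for Tamil and other non-Latin scripts) is one alternative to the word-count estimate. This is a sketch of the heuristic, not a claim about what Sarvam reports.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. Only a real tokenizer
    # for the model in question gives accurate counts; Sarvam's STT
    # response does not report token usage directly.
    if not text:
        return 0
    return max(1, len(text) // 4)
```

Either heuristic should be labeled as an estimate in the Usage object so downstream billing or quota logic does not treat it as exact.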

# 1. Simulate environment/credentials
# SARVAM_API_KEY is already defined in the notebook
SARVAM_API_KEY = "" # for testing only

Collaborator

This file is redundant

import os
from dotenv import load_dotenv
import logging

Collaborator

The file name should be test_sarvam_provider.py as it will contain TTS testing scripts as well

@Prajna1999
Collaborator

Prajna1999 commented Feb 28, 2026

@rajagopalmotivate please make the changes on priority as SARVAM is needed in our unified API.

cc @kartpop

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/pyproject.toml`:
- Line 40: The dependency entry "sarvamai>=0.1.25" is unbounded and should
follow the project's pre-1.0 bounding policy; update the dependency to constrain
upgrades (e.g., change the requirement expression for sarvamai to include an
upper bound like <0.2.0) so the package spec for sarvamai becomes a bounded
pre-1.0 range (retain the minimum 0.1.25 while adding the <0.2.0 cap).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 54cd43b and b99a0c0.

⛔ Files ignored due to path filters (1)
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • backend/app/models/llm/request.py
  • backend/app/services/llm/providers/registry.py
  • backend/pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/app/services/llm/providers/registry.py
  • backend/app/models/llm/request.py

"redis>=5.0.0,<6.0.0",
"flower>=2.0.1",
"google-genai>=1.59.0",
"sarvamai>=0.1.25",
@coderabbitai coderabbitai bot commented Mar 3, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Dependency declaration ==="
rg -n 'sarvamai' backend/pyproject.toml 2>/dev/null || echo "No sarvamai entry found"

echo ""
echo "=== Checking for lockfile ==="
fd '^uv\.lock$|^poetry\.lock$|^requirements.*\.txt$' -H backend/ 2>/dev/null || echo "No standard lockfile found in backend/"

echo ""
echo "=== Checking project dependencies structure ==="
head -50 backend/pyproject.toml | grep -A 20 "dependencies\|requires"

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 1025


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== PyPI version information for sarvamai ==="
curl -s https://pypi.org/pypi/sarvamai/json 2>/dev/null | jq -r '.releases | keys[]' | sort -V 2>/dev/null | tail -n 30

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 383


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Detailed version info for sarvamai ==="
curl -s https://pypi.org/pypi/sarvamai/json 2>/dev/null | jq '.info | {version, author, home_page, summary}' || echo "Failed to fetch PyPI info"

Repository: ProjectTech4DevAI/kaapi-backend

Length of output: 201


Constrain SarvamAI to a bounded pre-1.0 range.

At Line 40, sarvamai>=0.1.25 is open-ended. The project consistently bounds pre-1.0 packages (see sqlmodel<1.0.0 and email-validator<3.0.0.0), but sarvamai lacks an upper bound. Consider capping it to <0.2.0 to match the project's dependency strategy and reduce upgrade risk if a breaking change is introduced in a future 0.x release.

Proposed change
-    "sarvamai>=0.1.25",
+    "sarvamai>=0.1.25,<0.2.0",
📝 Committable suggestion


Suggested change
"sarvamai>=0.1.25",
"sarvamai>=0.1.25,<0.2.0",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/pyproject.toml` at line 40, The dependency entry "sarvamai>=0.1.25"
is unbounded and should follow the project's pre-1.0 bounding policy; update the
dependency to constrain upgrades (e.g., change the requirement expression for
sarvamai to include an upper bound like <0.2.0) so the package spec for sarvamai
becomes a bounded pre-1.0 range (retain the minimum 0.1.25 while adding the
<0.2.0 cap).

Collaborator

false alarm


@Prajna1999, understood! I'll disregard that suggestion. Thanks for clarifying.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
backend/app/services/llm/providers/sai.py (3)

25-25: ⚠️ Potential issue | 🟡 Minor

Add explicit -> None return type to __init__.

Line 25 should declare an explicit return type for consistency with repository typing rules.

🔧 Proposed fix
-    def __init__(self, client: SarvamAI):
+    def __init__(self, client: SarvamAI) -> None:

As per coding guidelines: **/*.py: Always add type hints to all function parameters and return values in Python code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/providers/sai.py` at line 25, Add an explicit return
type annotation to the class constructor by changing the signature of __init__
in backend/app/services/llm/providers/sai.py so it declares a None return (i.e.,
def __init__(self, client: SarvamAI) -> None:); update only the signature of the
__init__ method (the constructor that accepts client: SarvamAI) to satisfy the
repository typing rule.

8-17: ⚠️ Potential issue | 🟡 Minor

Remove duplicate TextOutput import.

TextOutput is imported twice, which triggers redefinition/lint errors and can hide real import issues.

🧹 Proposed fix
 from app.models.llm import (
     NativeCompletionConfig,
     LLMCallResponse,
     QueryParams,
     TextOutput,
     LLMResponse,
     Usage,
-    TextOutput,
     TextContent,
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/providers/sai.py` around lines 8 - 17, The import
list in sai.py contains a duplicated symbol TextOutput which causes
lint/redefinition errors; remove the duplicate TextOutput entry from the from
app.models.llm import (...) tuple so each symbol (e.g., NativeCompletionConfig,
LLMCallResponse, QueryParams, TextOutput, LLMResponse, Usage, TextContent) is
only imported once and keep the remaining import names unchanged and properly
ordered.

122-125: ⚠️ Potential issue | 🟡 Minor

Mask provider request IDs before logging.

Line 124 logs raw request_id; mask it before writing logs.

🔒 Proposed fix
+from app.core.util import mask_string
...
             logger.info(
                 f"[_execute_stt] Successfully transcribed audio | "
-                f"request_id={sarvam_response.request_id}, model={model}, mode={mode}"
+                f"request_id={mask_string(str(sarvam_response.request_id))}, model={model}, mode={mode}"
             )

As per coding guidelines: Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/providers/sai.py` around lines 122 - 125, In
_execute_stt replace direct logging of sarvam_response.request_id with a masked
value: call the existing mask_string(sarvam_response.request_id) (or implement
mask_string if missing) and pass that into logger.info so the log becomes
logger.info(f"[ _execute_stt ] Successfully transcribed audio |
request_id={mask_string(sarvam_response.request_id)}, model={model},
mode={mode}"); ensure you reference the sarvam_response.request_id symbol and
the logger.info call so the raw ID is never written to logs.
🧹 Nitpick comments (2)
backend/app/services/llm/mappers.py (1)

160-163: Normalize missing input_language to auto-detect.

When input_language is omitted, language_code becomes None. Treating missing and "auto" consistently avoids ambiguous downstream params.

♻️ Proposed fix
-    if input_language == "auto":
+    if input_language in (None, "auto"):
         sarvam_params["language_code"] = "unknown"
     else:
         sarvam_params["language_code"] = input_language
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/mappers.py` around lines 160 - 163, The logic sets
sarvam_params["language_code"] to "unknown" only when input_language == "auto",
but if input_language is missing/None it currently becomes None; update the
condition to treat missing/None the same as "auto" by checking for falsy or
explicit None (e.g., if input_language is None or input_language == "auto") so
sarvam_params["language_code"] is set to "unknown" in both cases; modify the
branch around the input_language assignment (where sarvam_params and
input_language are used) to normalize omitted input_language to auto-detect.
backend/app/services/llm/providers/sai.py (1)

44-45: Use os.path.isfile for STT path validation.

os.path.exists also returns true for directories, which can cause avoidable failures when opening the path as a file.

♻️ Proposed fix
-            if isinstance(query_input, str) and os.path.exists(query_input):
+            if isinstance(query_input, str) and os.path.isfile(query_input):
                 return query_input
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/providers/sai.py` around lines 44 - 45, The STT path
check currently uses os.path.exists which returns True for directories; update
the validation to use os.path.isfile so only actual files pass: in the branch
that tests isinstance(query_input, str) and os.path.exists(query_input) (the
code handling query_input for STT input), replace os.path.exists(query_input)
with os.path.isfile(query_input) so directories won't be treated as valid file
paths when returning query_input.
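The difference is easy to demonstrate: a directory satisfies `os.path.exists` but not `os.path.isfile`, so the stricter check rejects paths that would later fail when opened as a file:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # The directory exists...
    assert os.path.exists(d)
    # ...but only isfile distinguishes a regular file from a directory.
    assert not os.path.isfile(d)

with tempfile.NamedTemporaryFile() as f:
    # A regular file passes both predicates.
    assert os.path.exists(f.name)
    assert os.path.isfile(f.name)
```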
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/services/llm/mappers.py`:
- Around line 174-185: The warnings code in mappers.py uses a malformed string
for the instructions message and skips warning for temperature when
temperature==0.0; update the instructions warning to "Parameter 'instructions'
is not supported by Sarvam AI and was ignored" (fix the missing quote) and
change the temperature check from truthy "if temperature:" to an explicit
presence/None check (e.g., check "if 'temperature' in kaapi_params" or "if
temperature is not None") so a value of 0.0 still triggers a warning; operate on
the same variable names (kaapi_params, instructions, temperature, warnings) in
the mapper function.

---

Duplicate comments:
In `@backend/app/services/llm/providers/sai.py`:
- Line 25: Add an explicit return type annotation to the class constructor by
changing the signature of __init__ in backend/app/services/llm/providers/sai.py
so it declares a None return (i.e., def __init__(self, client: SarvamAI) ->
None:); update only the signature of the __init__ method (the constructor that
accepts client: SarvamAI) to satisfy the repository typing rule.
- Around line 8-17: The import list in sai.py contains a duplicated symbol
TextOutput which causes lint/redefinition errors; remove the duplicate
TextOutput entry from the from app.models.llm import (...) tuple so each symbol
(e.g., NativeCompletionConfig, LLMCallResponse, QueryParams, TextOutput,
LLMResponse, Usage, TextContent) is only imported once and keep the remaining
import names unchanged and properly ordered.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b99a0c0 and 318a94c.

📒 Files selected for processing (3)
  • backend/app/models/llm/request.py
  • backend/app/services/llm/mappers.py
  • backend/app/services/llm/providers/sai.py

Comment on lines +174 to +185
instructions = kaapi_params.get("instructions")
if instructions:
warnings.append(
"Parameter 'instructions is not supported by Sarvam AI and was ignored"
)

temperature = kaapi_params.get("temperature")

if temperature:
warnings.append(
"Parameter 'temperature' is not supported by Sarvam AI and was ignored"
)

⚠️ Potential issue | 🟡 Minor

Fix warning text and temperature=0.0 edge case.

Line 177 warning text is malformed, and Line 182 skips warnings when temperature is explicitly 0.0.

🛠️ Proposed fix
     instructions = kaapi_params.get("instructions")
     if instructions:
         warnings.append(
-            "Parameter 'instructions is not supported by Sarvam AI and was ignored"
+            "Parameter 'instructions' is not supported by Sarvam AI and was ignored."
         )
...
-    if temperature:
+    if temperature is not None:
         warnings.append(
-            "Parameter 'temperature' is not supported by Sarvam AI and was ignored"
+            "Parameter 'temperature' is not supported by Sarvam AI and was ignored."
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/mappers.py` around lines 174 - 185, The warnings
code in mappers.py uses a malformed string for the instructions message and
skips warning for temperature when temperature==0.0; update the instructions
warning to "Parameter 'instructions' is not supported by Sarvam AI and was
ignored" (fix the missing quote) and change the temperature check from truthy
"if temperature:" to an explicit presence/None check (e.g., check "if
'temperature' in kaapi_params" or "if temperature is not None") so a value of
0.0 still triggers a warning; operate on the same variable names (kaapi_params,
instructions, temperature, warnings) in the mapper function.
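The edge case comes from Python truthiness: `0.0` is falsy, so `if temperature:` silently skips the warning when a caller explicitly passes `temperature=0.0`. A standalone sketch of the buggy versus fixed check (hypothetical function names; not the project's actual mapper):

```python
def warn_truthy(params: dict) -> list[str]:
    warnings = []
    if params.get("temperature"):  # buggy: 0.0 is falsy and is skipped
        warnings.append(
            "Parameter 'temperature' is not supported by Sarvam AI and was ignored."
        )
    return warnings

def warn_explicit(params: dict) -> list[str]:
    warnings = []
    if params.get("temperature") is not None:  # 0.0 still triggers the warning
        warnings.append(
            "Parameter 'temperature' is not supported by Sarvam AI and was ignored."
        )
    return warnings

print(warn_truthy({"temperature": 0.0}))    # [] -- warning lost
print(warn_explicit({"temperature": 0.0}))  # one-element list -- warning kept
```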

@Prajna1999 Prajna1999 changed the title STT: Integrated Sarvam STT TTS/STT: Integrated Sarvam STT and TTS Mar 3, 2026

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
backend/app/services/llm/mappers.py (1)

191-207: Add error handling for incomplete STT/TTS configuration.

The function silently returns {"model": ...} when neither voice (TTS) nor input_language/output_language (STT) are provided, masking incomplete configuration. Since STTLLMParams allows both language parameters to be None, this edge case is reachable despite upstream validation and should be caught explicitly.

♻️ Proposed fix
     elif input_language is not None or kaapi_params.get("output_language") is not None:
         # STT mode - map STTLLMParams
         ...
+    else:
+        # Neither TTS (voice) nor STT (input_language/output_language) params detected
+        return {}, ["Unable to determine STT or TTS mode: missing 'voice' for TTS or 'input_language'/'output_language' for STT"]
 
     return sarvam_params, warnings
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/mappers.py` around lines 191 - 207, The mapping
branch for STT/TTS currently proceeds silently when neither TTS `voice` nor STT
`input_language`/`output_language` are provided, so update the STT/TTS handling
in the block that sets `sarvam_params` (referencing `input_language`,
`kaapi_params.get("output_language")`, and `voice`) to validate configuration
and raise a clear exception (e.g., ValueError) or return an explicit error if
both TTS `voice` is falsy and both `input_language` and `output_language` are
None/empty; ensure the check occurs before computing
`transcription_mode`/`language_code` so incomplete STT/TTS configs are rejected
rather than returning only `{"model": ...}`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@backend/app/services/llm/mappers.py`:
- Around line 191-207: The mapping branch for STT/TTS currently proceeds
silently when neither TTS `voice` nor STT `input_language`/`output_language` are
provided, so update the STT/TTS handling in the block that sets `sarvam_params`
(referencing `input_language`, `kaapi_params.get("output_language")`, and
`voice`) to validate configuration and raise a clear exception (e.g.,
ValueError) or return an explicit error if both TTS `voice` is falsy and both
`input_language` and `output_language` are None/empty; ensure the check occurs
before computing `transcription_mode`/`language_code` so incomplete STT/TTS
configs are rejected rather than returning only `{"model": ...}`.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 318a94c and a61b158.

📒 Files selected for processing (2)
  • backend/app/services/llm/mappers.py
  • backend/app/services/llm/providers/sai.py

@codecov
Copy link

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 99.23810% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
backend/app/services/llm/providers/sai.py 95.87% 4 Missing ⚠️


@Prajna1999 Prajna1999 merged commit e28079c into main Mar 5, 2026
3 checks passed
@Prajna1999 Prajna1999 deleted the feature/stt-sarvam branch March 5, 2026 12:26


Development

Successfully merging this pull request may close these issues.

TTS/STT: Integrate Sarvam as a Provider to the Unified API for STT and TTS

3 participants